Golay Code Transformations for Ensemble Clustering in Application to Medical Diagnostics

Authors

  • Faisal Alsaby
  • Simon Berkovich
Abstract

Clinical Big Data streams have accumulated large-scale multidimensional data about patients’ medical conditions and drugs, along with their known side effects. The volume and complexity of these streams hinder current computational procedures. Effective tools are required to cluster and systematically analyze this amorphous data so that data mining tasks, including knowledge discovery, identification of underlying relationships, and pattern prediction, can be performed. This paper presents a novel computational model for clustering very large volumes of Big Data streams. The approach uses the error-correcting Golay code. This clustering methodology is unique: it outperforms conventional techniques in that it has linear time complexity and does not impose predefined cluster labels that partition the data. Because extracting meaningful knowledge from these clusters is an essential task, the paper also presents a novel semi-supervised mechanism that facilitates predicting patterns and likely diseases.

Keywords: medical Big Data; clustering; machine learning; pattern recognition; prediction tool; Big Data classification; Golay
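The abstract does not spell out how the Golay code is used for clustering, so the following is only a generic sketch of the underlying idea, not the authors' algorithm. It assumes each record has already been encoded as a 23-bit word (the encoding step is not shown and is an assumption), and it groups records by the nearest codeword of the binary (23,12) Golay code. Because that code is perfect, every 23-bit word lies within Hamming distance 3 of exactly one codeword, so each record is assigned with a single table lookup and the whole pass is linear in the number of records. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

N = 23        # length of the binary Golay code
GEN = 0xC75   # generator polynomial x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1


def syndrome(word):
    """Remainder of a 23-bit word divided by GEN over GF(2) (11 bits)."""
    for shift in range(N - 1, 10, -1):          # bit positions 22 down to 11
        if word & (1 << shift):
            word ^= GEN << (shift - 11)
    return word


def build_error_table():
    """Map each of the 2048 syndromes to its error pattern of weight <= 3."""
    table = {}
    for weight in range(4):
        for positions in combinations(range(N), weight):
            err = 0
            for p in positions:
                err |= 1 << p
            table[syndrome(err)] = err
    return table


ERRORS = build_error_table()


def nearest_codeword(word):
    """Return the unique Golay codeword within Hamming distance 3 of word."""
    return word ^ ERRORS[syndrome(word)]


def cluster(words):
    """Group 23-bit words by nearest codeword; one lookup per word."""
    groups = defaultdict(list)
    for w in words:
        groups[nearest_codeword(w)].append(w)
    return dict(groups)
```

Calling `cluster` on a list of 23-bit integers returns a dictionary keyed by codeword. No cluster labels are fixed in advance; the keys emerge from the data. Two similar records can still fall on opposite sides of a codeword boundary, which a practical scheme would need to handle and which this sketch does not attempt.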


Similar resources

A new ensemble clustering method based on fuzzy c-means clustering while maintaining diversity in ensemble

Ensemble clustering has been considered one of the main research approaches in data mining, pattern recognition, machine learning and artificial intelligence over the last decade. In ensemble clustering, several base clusterings are produced first, and then an aggregation function is used to create a final clustering that is as similar as possible to all of the base clusterings. The inp...


The ensemble clustering with maximize diversity using evolutionary optimization algorithms

Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...


Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...


High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...


Symbol-by-symbol APP decoding of the Golay code and iterative decoding of concatenated Golay codes

An efficient coset-based symbol-by-symbol soft-in/soft-out APP decoding algorithm is presented for the Golay code. Its application in the iterative decoding of concatenated Golay codes is examined.




Publication date: 2015